Abstract

The random split tree introduced by Devroye (1999) is considered. We derive a
second order expansion for the mean of its internal path length and furthermore
obtain a limit law by the contraction method. As an assumption we need the
splitter to have a Lebesgue density and mass in every neighborhood of 1. We use
properly stopped homogeneous Markov chains, for which limit results in total
variation distance as well as renewal theory are used. Furthermore, we extend
this method to obtain the corresponding results for the Wiener index.
1 Introduction

The random split tree introduced by Devroye (1999) is a general tree model
which for special choices of its parameters covers various random trees that are
fundamental in Computer Science for their use as data structures, e.g. binary
search trees, quadtrees, m-ary search trees, simplex trees, tries etc. Many
characteristic quantities of these trees such as node depths, height, path length
or other distance measures between nodes describe the complexity of algorithms
that make use of the trees. For this reason, the asymptotic behavior of such
quantities is studied in the probabilistic analysis of algorithms. Whereas such
characteristic quantities are often studied one by one for each tree, Devroye's idea
was to derive universal results valid for the whole class of his split tree model.
We recall the definition of the split tree from Devroye (1999). Four parameters
$b, s, s_0, s_1 \in \mathbb{N}_0$ are given where $b \ge 2$ is the branching factor, $s > 0$ is the
vertex capacity and $s_0$ and $s_1$ satisfy the two conditions

$$0 \le s_0 \le s, \qquad 0 \le b s_1 \le s + 1 - s_0.$$

Furthermore, a random vector $V = (V_1, \dots, V_b) \in [0,1]^b$ with $\sum_{k=1}^{b} V_k = 1$
is given. The random split tree of size $n$ is obtained by distributing $n$ balls
to the nodes of the infinite $b$-ary tree according to the following procedure.
For a node $u$ of the $b$-ary tree let $C(u)$ denote the number of balls already
assigned to this node and $N(u)$ be the number of balls associated to any node
in the subtree rooted at this node. For each node $u$ take an independent copy
$V^{(u)} = (V_1^{(u)}, \dots, V_b^{(u)})$ of the random vector $V$. Initially, there are no balls
distributed (i.e. $C(u) = 0$ for all $u$). The balls are added to the tree sequentially.
Adding a ball to a tree rooted at u proceeds as follows: 

a) If $u$ is not a leaf (i.e. $C(u) < N(u)$), choose child $i$ with probability $V_i^{(u)}$,
increment $N(u)$ by 1 and recursively add the ball to the subtree rooted
at child $i$.

b) If $u$ is a leaf and $C(u) = N(u) < s$, then add the ball to $u$ and stop. $C(u)$
and $N(u)$ are incremented by 1.

c) If $u$ is a leaf but $C(u) = N(u) = s$, we set $N(u) = s + 1$ and $C(u) = s_0$,
place $s_0 \le s$ randomly selected balls at $u$, give $s_1$ randomly selected balls
to each of the $b$ children of $u$ and set $C(v) = s_1 = N(v)$ for all children $v$
of $u$. After that, we add each of the remaining $s + 1 - s_0 - b s_1 \ge 0$ balls
one by one randomly and independently to the subtree rooted at child $i$
with probability $V_i^{(u)}$ by applying the procedure recursively.
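The three cases of the insertion procedure can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not code from Devroye (1999): the names `Node`, `add_ball`, `build_split_tree`, `count_balls`, `internal_path_length` and `bst_split` are hypothetical, children are created lazily instead of working on the infinite $b$-ary tree, and the split vector $V^{(u)}$ is drawn at the moment a node splits, which is distributionally equivalent to drawing it in advance.

```python
import random


class Node:
    """One node u of the b-ary tree; children are created lazily at the first split."""

    def __init__(self):
        self.balls = []       # balls stored at u, so C(u) = len(self.balls)
        self.size = 0         # N(u): number of balls in the subtree rooted at u
        self.children = None  # None as long as u is a leaf
        self.split = None     # V^(u), drawn when u splits (assumption: lazy draw)


def add_ball(u, ball, b, s, s0, s1, draw_split, rng):
    if u.children is not None:
        # case a): u is not a leaf, pass the ball to child i with probability V_i^(u)
        u.size += 1
        i = rng.choices(range(b), weights=u.split)[0]
        add_ball(u.children[i], ball, b, s, s0, s1, draw_split, rng)
    elif u.size < s:
        # case b): leaf with free capacity
        u.balls.append(ball)
        u.size += 1
    else:
        # case c): full leaf, C(u) = N(u) = s, so u splits
        pool = u.balls + [ball]
        rng.shuffle(pool)
        u.split = draw_split(rng)
        u.children = [Node() for _ in range(b)]
        u.balls, rest = pool[:s0], pool[s0:]
        u.size = s + 1
        for child in u.children:
            child.balls, rest = rest[:s1], rest[s1:]
            child.size = s1
        # the remaining s + 1 - s0 - b*s1 balls are sent down independently
        for x in rest:
            i = rng.choices(range(b), weights=u.split)[0]
            add_ball(u.children[i], x, b, s, s0, s1, draw_split, rng)


def build_split_tree(n, b, s, s0, s1, draw_split, seed=0):
    rng = random.Random(seed)
    root = Node()
    for ball in range(n):
        add_ball(root, ball, b, s, s0, s1, draw_split, rng)
    return root


def count_balls(root):
    total, stack = 0, [root]
    while stack:
        u = stack.pop()
        total += len(u.balls)
        if u.children is not None:
            stack.extend(u.children)
    return total


def internal_path_length(root):
    """Sum of the depths of all balls."""
    total, stack = 0, [(root, 0)]
    while stack:
        u, depth = stack.pop()
        total += depth * len(u.balls)
        if u.children is not None:
            stack.extend((c, depth + 1) for c in u.children)
    return total


def bst_split(rng):
    """Split vector (U, 1 - U) with U uniform on [0, 1]."""
    u = rng.random()
    return (u, 1.0 - u)
```

For instance, `build_split_tree(500, b=2, s=1, s0=1, s1=0, draw_split=bst_split)` uses the parameter choice that corresponds to the random binary search tree.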

Usually, one assumes that $V_i \stackrel{d}{=} V_1 =: V$ for all $i = 2, \dots, b$ where $V$ is called
the splitter and its distribution is called the splitting distribution. By $\stackrel{d}{=}$ we
denote that the left and right hand sides have identical distributions. Whenever
the functional under consideration is independent of the tree ordering, this
assumption does not mean any loss of generality. This can be seen by a random
permutation argument, already stated in Devroye (1999). In this paper we need
an additional assumption:

General assumption: Throughout this paper we assume that the distribution
of $V$ has a Lebesgue density $f_V$ and that for the distribution function we have
$F_V(x) < 1$ for all $x < 1$.

As mentioned in the beginning, the random split tree models many common
random trees. For instance, choosing $s = s_0 = b - 1$ for some $b \ge 2$, $s_1 = 0$ and
$V = \min\{U_1, \dots, U_{b-1}\}$ where $U_1, \dots, U_{b-1}$ are independent random variables
uniformly distributed on $[0, 1]$ one gets the random $b$-ary search tree. The
random median-of-$(2k+1)$ binary search tree can be realized by setting $b = 2$,
$s = 2k$, $s_0 = 1$, $s_1 = k$ and $V = \mathrm{median}(U_1, \dots, U_{2k+1})$. Also some digital data
structures are covered by the split tree model. For $V$ uniformly distributed on
the deterministic set $\{p_1, \dots, p_b\}$, $s = 1$ and $s_1 = 0$ one obtains in the case
$s_0 = 0$ the trie and in the case $s_0 = 1$ the digital search tree. In Table 1 in
Devroye (1999) more examples of important tree models are listed with the
corresponding choices of the parameters.

The general assumption and with it the results of this paper hold true for many
of these examples, such as random binary search trees, random $b$-ary search trees,
random quadtrees, random median-of-$(2k+1)$ binary search trees, random simplex
trees, (extended) AB trees and random m-grid trees. However, the results are
not applicable to the common digital data structures such as tries and digital search
trees.



The depth of the $n$-th ball in a random split tree, denoted by $D_n$, is the number
of edges on the path from the ball to the root of the tree. The internal path
length of balls in the split tree is the sum of all depths of balls and is denoted
by $P_n$ for the tree with $n$ balls. Thus, we have

$$P_n = \sum_{k=1}^{n} D_k.$$



The asymptotic expansion of the expectation of $P_n$ was investigated for $m$-ary
search trees in Mahmoud, for random quadtrees by Flajolet et al. (1995)
and for the median-of-$(2k+1)$ binary search tree by Chern and Hwang (2001)
and Rösler (2001). In Holmgren (2010) the internal path length of random
split trees is considered under the assumption that the splitting distribution
is non-lattice. The first term and an upper bound of the second term of the
asymptotic mean are derived using renewal theory.

Limit theorems for the distribution of the path length are proved for the random
binary search tree in Régnier (1989) and Rösler (1991) and for the random
recursive tree in Dobrow and Fill (1999).



Using the contraction method, Neininger and Rüschendorf (1999, Theorem 5.1)
showed a universal limit theorem for the internal path length of random split
trees under the assumption that the asymptotic expansion of the expectation
of the internal path length is of the form

$$E[P_n] = \frac{1}{\mu}\, n \log n + c\, n + o(n) \tag{1}$$

as $n \to \infty$. Therefore, it is of interest to characterize all splitting distributions
providing an asymptotic expectation of the form (1). The first result of this
paper is the following.

Theorem 1.1. Let $P_n$ denote the internal path length in a random split tree of
size $n$ with branching factor $b$ where the one-dimensional marginal distribution
$V$ of the splitting vector fulfills the general assumption. Then there exists a
constant $c_p \in \mathbb{R}$ with

$$E[P_n] = \frac{1}{\mu}\, n \log n + c_p\, n + o(n)$$

as $n \to \infty$ where $\mu = -b E[V \log V]$.
To state the result which follows from the combination of the limit theorem
from Neininger and Rüschendorf (1999) with Theorem 1.1 we introduce some
notation. By $\mathcal{M}_{0,2}$ we denote the set of centered probability measures on $\mathbb{R}$
with finite second moments. We denote the distribution of a random variable
$X$ by $\mathcal{L}(X)$ or $P^X$. The Wasserstein metric $\ell_2$ on $\mathcal{M}_{0,2}$ is defined by

$$\ell_2(\nu_1, \nu_2) := \inf\{\|X - Y\|_2 : \mathcal{L}(X) = \nu_1, \mathcal{L}(Y) = \nu_2\} \tag{2}$$

where the $L_2$-norm $\|\cdot\|_2$ is given by $\|X\|_2 = (E[X^2])^{1/2}$. For random variables
$X$ and $Y$ we set $\ell_2(X, Y) := \ell_2(\mathcal{L}(X), \mathcal{L}(Y))$. It is well known that convergence
with respect to the metric $\ell_2$ (denoted by $\xrightarrow{\ell_2}$) is equivalent to weak convergence
plus convergence of the second moments (see e.g. Bickel and Freedman (1981)).
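To make the definition of $\ell_2$ concrete, the following small Python sketch computes the distance between the empirical distributions of two samples of equal size on the real line. It relies on the standard fact that in one dimension the infimum in the definition is attained by the monotone coupling, i.e. by pairing the order statistics. The function name `l2_distance_empirical` is hypothetical, and the sketch is only an illustration, not part of the method of this paper.

```python
import math


def l2_distance_empirical(xs, ys):
    """Wasserstein l2 distance between the empirical measures of xs and ys.

    Assumes both samples have the same size, so that each atom carries the
    same weight and the optimal coupling pairs the sorted values.
    """
    if len(xs) != len(ys):
        raise ValueError("samples must have equal size")
    xs, ys = sorted(xs), sorted(ys)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))
```

The order in which the observations are listed does not matter, since only the two empirical distributions enter.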



Corollary 1.2. Let $P_n$ denote the internal path length in a random split tree
of size $n$ where the one-dimensional marginal distribution of the splitting vector
$(V_1, \dots, V_b)$ fulfills the general assumption. Define $X_n := (P_n - E[P_n])/n$. Then
the following holds true:

a) As $n \to \infty$ we have $\ell_2(X_n, X) \to 0$ where $\mathcal{L}(X)$ is the unique solution
in $\mathcal{M}_{0,2}$ of the fixed point equation

$$X \stackrel{d}{=} \sum_{k=1}^{b} V_k X^{(k)} + 1 + \frac{1}{\mu} \sum_{k=1}^{b} V_k \log V_k,$$

where $\mu := -b E[V_1 \log V_1]$, $\mathcal{L}(X^{(k)}) = \mathcal{L}(X)$ for all $k = 1, \dots, b$ and
$X^{(1)}, \dots, X^{(b)}, (V_1, \dots, V_b)$ are independent.

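b) In particular, the convergence in a) implies

$$\mathrm{Var}(P_n) = \sigma^2 n^2 + o(n^2)$$

with

$$\sigma^2 = \left(\frac{1}{\mu^2}\, E\left[\Big(\sum_{k=1}^{b} V_k \log V_k\Big)^2\right] - 1\right) \left(1 - \sum_{k=1}^{b} E[V_k^2]\right)^{-1}.$$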

c) Exponential moments exist and converge,

$$E[\exp(\lambda X_n)] \to E[\exp(\lambda X)], \qquad \lambda \in \mathbb{R}.$$

d) For all $k$ we have as $n \to \infty$,

$$P(|P_n - E[P_n]| > \varepsilon E[P_n]) = O(n^{-k}).$$

Remark 1.3. The tail bound given in d) is known not to be sharp in particular
examples. McDiarmid and Hayward (1996) and Fill and Janson (2002) give a
more precise bound for the random binary search tree.
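As a sanity check of the variance constant in part b) of the corollary, one can evaluate $\sigma^2 = (\mu^{-2} E[(\sum_k V_k \log V_k)^2] - 1)/(1 - \sum_k E[V_k^2])$ numerically for the random binary search tree, i.e. $b = 2$ and $(V_1, V_2) = (U, 1 - U)$ with $U$ uniform on $[0,1]$. The sketch below uses a plain midpoint rule; the helper names are hypothetical and the quadrature is an assumption of this illustration, not a tool used in the paper. The result can be compared with the classical constant $7 - 2\pi^2/3$ known for the binary search tree.

```python
import math


def midpoint_integral(f, n_steps=200_000):
    """Midpoint rule on (0, 1); the nodes never hit the endpoints 0 and 1."""
    h = 1.0 / n_steps
    return h * sum(f((i + 0.5) * h) for i in range(n_steps))


def bst_variance_constant():
    """Return (mu, sigma^2) for b = 2 and (V_1, V_2) = (U, 1 - U)."""
    def xlogx(x):
        return x * math.log(x)

    mu = -2.0 * midpoint_integral(xlogx)  # mu = -b E[V log V]
    second = midpoint_integral(lambda x: (xlogx(x) + xlogx(1.0 - x)) ** 2)
    sum_sq = 2.0 * midpoint_integral(lambda x: x * x)  # sum_k E[V_k^2]
    sigma2 = (second / mu ** 2 - 1.0) / (1.0 - sum_sq)
    return mu, sigma2
```

Calling `bst_variance_constant()` returns the pair $(\mu, \sigma^2)$ for this special case.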
The Wiener index of a random split tree is defined as the sum of the distances
between all unordered pairs of balls, where the distance between two balls
is given by the minimum number of edges connecting the nodes which are
associated to the balls. For trees, the two-dimensional vector consisting of the
Wiener index and the internal path length satisfies a recursion formula similar
to that of the latter one. Using this recursion formula, Neininger proved
a limit theorem for the Wiener index of the random binary search tree and the
random recursive tree by the use of the multivariate contraction theorem. In a
final remark, Neininger mentioned that a limit theorem for the Wiener
index of the general split tree can be proved in a similar way after determining
the asymptotic expansion of its expectation sufficiently well.
We prove this asymptotic expansion and use the contraction method to obtain
the limit theorem for the Wiener index of random split trees which fulfil the
general assumption.

Theorem 1.4. Let $W_n$ denote the Wiener index in a random split tree of size
$n$ with branching factor $b$ where the one-dimensional marginal distribution $V$ of
the splitting vector fulfills the general assumption. Then there exists a constant
$c_w \in \mathbb{R}$ with

$$E[W_n] = \frac{1}{\mu}\, n^2 \log n + c_w\, n^2 + o(n^2)$$

as $n \to \infty$ where $\mu = -b E[V \log V]$.

We denote by $\mathcal{M}^2_{0,2}$ the set of centered probability measures on $\mathbb{R}^2$ with finite
second moments. The Wasserstein metric $\ell_2$ on the set $\mathcal{M}^2_{0,2}$ is defined similarly
to the one-dimensional case.

Theorem 1.5. Let $(W_n, P_n)$ denote the vector consisting of the Wiener index
and the internal path length of a random split tree of size $n$ with branching
factor $b$ where the one-dimensional marginal distribution of the splitting vector
$(V_1, \dots, V_b)$ fulfills the general assumption. Then the following holds true:

a) We have as $n \to \infty$,

$$\left(\frac{W_n - E[W_n]}{n^2}, \frac{P_n - E[P_n]}{n}\right) \xrightarrow{\ell_2} (W, P),$$

where $(W, P)$ is the unique distributional fixed-point of the map $T : \mathcal{M}^2_{0,2} \to \mathcal{M}^2_{0,2}$
given for $\nu \in \mathcal{M}^2_{0,2}$ by



with 



Tiu 



\i=l 



Vi 




where $\mathcal{L}(X^{(i)}) = \nu$ for $X^{(i)} = (X^{(i)}_1, X^{(i)}_2)$, and $X^{(1)}, \dots, X^{(b)}, D, Z$ are
independent.

b) In particular, the convergence in a) implies

$$\mathrm{Var}(W_n) = \sigma^2 n^4 + o(n^4)$$

with some constant $\sigma^2 > 0$.

Remark 1.6. The constant $\mu = -b E[V \log V]$ in the first order terms of the
expectations of the internal path length and of the Wiener index appears already
in the results about the height and depth in Devroye (1999). There, the explicit
values of this constant for the individual splitting distributions are given in
Table 2.
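For a concrete feeling for the constant $\mu$, the following Python sketch evaluates $\mu = -b E[V \log V]$ numerically for the random $b$-ary search tree, where $V = \min\{U_1, \dots, U_{b-1}\}$ has density $(b-1)(1-x)^{b-2}$ on $[0,1]$. A short computation with the Beta$(1, b-1)$ law gives the closed form $\mu = H_b - 1$ with the harmonic number $H_b$, which the sketch uses as a comparison. The function names and the midpoint quadrature are assumptions of this illustration and are not taken from Devroye (1999).

```python
import math


def mu_b_ary_search_tree(b, n_steps=200_000):
    """mu = -b E[V log V] for V = min(U_1, ..., U_{b-1}), by a midpoint rule."""
    h = 1.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        x = (i + 0.5) * h
        density = (b - 1) * (1.0 - x) ** (b - 2)  # density of the minimum
        total += density * x * math.log(x)
    return -b * h * total


def harmonic(b):
    """Harmonic number H_b = 1 + 1/2 + ... + 1/b."""
    return sum(1.0 / j for j in range(1, b + 1))
```

For $b = 2$ the density is constant and the value is $\mu = 1/2$, so that the leading term of $E[P_n]$ is $2 n \log n$, as is well known for the random binary search tree.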

Remark 1.7. Besides the internal path length for the balls considered here, there
is also the internal path length for the nodes where the depths of all nodes
are summed up. Since there can be up to $s$ balls in one node, these two
path lengths may differ. In Holmgren (2010) the relation between the two
versions is investigated. Let $N_n$ denote the number of nodes in the random
split tree with $n$ balls. Assuming that the distribution of $-\log V$ is non-lattice,
$P(V = 1) = P(V = 0) = 0$ and

$$E[N_n] = \alpha n + O\left(\frac{n}{\log^{1+\varepsilon} n}\right) \tag{3}$$

for some constant $\alpha > 0$ and $\varepsilon > 0$, Holmgren (2010) showed that Theorem
1.1 implies the similar asymptotic behavior for the internal path length for the
nodes in that random split tree. This finally yields the general limit theorem
for the internal path length for the nodes in split trees which additionally fulfil
equation (3). For instance, Mahmoud and Pittel (1989) showed the stronger
result $E[N_n] = \alpha n + O(n^{1-\varepsilon})$ in the case of the $b$-ary search tree.
It seems that there are no results on the corresponding alternative version of
the Wiener index in terms of the node-to-node distances.

The internal path length and the Wiener index have been considered also for
random trees that do not belong to the class of split trees. A universal limit
law for the path length of simply generated trees is proved in Janson (2003)
where the limit distribution is given as a function of the Brownian excursion.
Furthermore, the moments of the limit are derived. For the class of
random increasing trees, which covers in particular the random recursive tree
and the plane oriented recursive tree, the second order asymptotic of the
expectation of the internal path length is derived in Bergeron et al. (1992). In
Munsonius and Rüschendorf (2010) the asymptotic behavior of the expectation
and a limit theorem for the internal path length of random $b$-ary trees with
weighted edges are proved. By special choices of the edge weights, the analogous
results are obtained for the class of random linear recursive trees, which
encompasses in particular the random plane oriented recursive tree. Tail bounds
for the Wiener index of random binary search trees have been considered by
Ali Khan and Neininger (2007).

For a random split tree with $n$ balls we denote by $I_n = (I_{n,1}, \dots, I_{n,b})$ the
vector of the sizes of the subtrees, i.e. the number of balls assigned to nodes
in the subtrees, rooted at the children of the root. By the construction of
the split tree it follows that, conditionally given $V^{(\mathrm{root})} = (v_1, \dots, v_b)$, the vector
$I_n - (s_1, \dots, s_1)$ is multinomially distributed $M(n - s_0 - b s_1; v_1, \dots, v_b)$.
Thus, under the assumption that $V_i \stackrel{d}{=} V_1 =: V$ for all $i = 2, \dots, b$ we obtain

$$P(I_{n,1} = k + s_1) = \int_0^1 P(\mathrm{Bin}(\eta_n, x) = k)\, dP^V(x), \tag{4}$$

where we set $\eta_n := n - s_0 - b s_1$. Throughout this paper, $\mathrm{Bin}(m, x)$ denotes
a random variable with binomial distribution with parameters $m \in \mathbb{N}$ and
$x \in [0, 1]$.
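Formula (4) can be evaluated exactly when the splitter has a Beta$(p, q)$ distribution with positive integer parameters, since the mixed binomial is then a beta-binomial distribution. The following Python sketch does this with exact fractions. The names `beta_int` and `subtree_size_pmf` are hypothetical, and the restriction to integer Beta parameters is an assumption of the example; it covers for instance the $b$-ary search tree, Beta$(1, b-1)$, and the median-of-$(2k+1)$ binary search tree, Beta$(k+1, k+1)$.

```python
from fractions import Fraction
from math import comb, factorial


def beta_int(p, q):
    """Euler Beta function B(p, q) for positive integers p and q, exactly."""
    return Fraction(factorial(p - 1) * factorial(q - 1), factorial(p + q - 1))


def subtree_size_pmf(n, b, s0, s1, p, q):
    """Exact values P(I_{n,1} = k + s1), k = 0, ..., eta_n, following (4).

    The splitter V is assumed to be Beta(p, q) distributed, so the integral
    of the binomial weights against the density of V is a Beta integral.
    """
    eta = n - s0 - b * s1
    norm = beta_int(p, q)
    return {
        k + s1: comb(eta, k) * beta_int(k + p, eta - k + q) / norm
        for k in range(eta + 1)
    }
```

For the binary search tree ($b = 2$, $s_0 = 1$, $s_1 = 0$, uniform splitter) this returns the uniform distribution on $\{0, \dots, n-1\}$, and in general the mean equals $(n - s_0)/b$ whenever $E[V] = 1/b$.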

The proofs of Theorem 1.1 and Theorem 1.4 are based on a method developed
in Bruhn (1996) for recurrences where the toll function is bounded. In Section
2 we recall definitions and results of Bruhn (1996) and extend his method to
the case of an unbounded toll function. We check the conditions of this method
in the case of the random split tree in Section 3. Section 4 is devoted to the
application in the case of the internal path length and the proof of Theorem
1.1. In Section 5 we give the proofs of Theorem 1.4 and Theorem 1.5 concerning
the Wiener index.
Nicolas Broutin for helpful discussions and making a preliminary manuscript
of the paper Broutin and Holmgren (2011) on the internal path length of split
2 The setting of Bruhn

Starting from recursion formulas of the form

$$H_n = \sum_{k=0}^{n-1} \nu_n(\{k\}) H_k + r(n),$$

where $\nu_n$ is a probability measure on $\{0, \dots, n-1\}$ for all $n \in \mathbb{N}$, the main idea
of Bruhn (1996) is to define a homogeneous Markov chain $(S_t)_{t \in \mathbb{N}_0}$ with state
space $\mathcal{E} = \{-\log n : n \in \mathbb{N}\} \cup \{1\}$ where the transition probabilities are given
for $n > 0$ by

$$P(S_1 = x \mid S_0 = -\log n) = \begin{cases} \nu_n(\{e^{-x}\}), & \text{for } x \in \{-\log(n-1), \dots, -\log 1\}, \\ \nu_n(\{0\}), & \text{for } x = 1, \end{cases}$$

and $P(S_1 = 1 \mid S_0 = 1) = 1$. Now, let $\sigma(n_1) := \inf\{t \mid S_t > -\log n_1\}$ be
the stopping time when the Markov chain exceeds $-\log n_1$ for $n_1 \in \mathbb{N}$. Then,
Bruhn proved the representation formula given in the following lemma. (Since
the PhD thesis of Bruhn seems not to be available in English, the proofs of
Bruhn (1996) are stated in Appendix B.)

We denote by $Y_t := S_t - S_{t-1}$ the increments of $S$. For $x \in \mathcal{E}$ we write $P_x(\cdot)$
in short for $P(\cdot \mid S_0 = x)$ and correspondingly $E_x[\cdot]$ for the expectation with
respect to the measure $P_x$. We denote by $F_x$ the distribution function of $P_x^{Y_1}$,
i.e. $F_x(y) = P(S_1 - x \le y \mid S_0 = x)$.

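Lemma 2.1. Let $H_n$ be a sequence of real numbers satisfying

$$H_n = \sum_{k=0}^{n-1} \nu_n(\{k\}) H_k + r(n)$$

for some function $r$. Then for any $n_1 \in \mathbb{N}$ we have, with the notations above,

$$H_n = E_{-\log n}\left[H_{\exp(-S_{\sigma(n_1)})}\right] + E_{-\log n}\left[\sum_{t=0}^{\sigma(n_1)-1} r(\exp(-S_t))\right]. \tag{5}$$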

To analyze the Markov chain $(S_t)_{t \in \mathbb{N}_0}$ we consider in the following a general
state space $\mathcal{E} \subset \mathbb{R}$.

Definition 2.2. The Markov chain $(S_t)_{t \in \mathbb{N}_0}$ is called an AR-process (approximate
renewal) if the state space $\mathcal{E}$ has no lower bound, the increments
$Y_t := S_t - S_{t-1}$ are strictly positive, $F_x$ converges in distribution as $x \to -\infty$
to a distribution function $F$, i.e. for all points $t$ where $F$ is continuous we have

$$\lim_{x \to -\infty} F_x(t) = F(t),$$

and $0 < \int t\, dF(t) < \infty$.

For $a \in \mathbb{R}_-$ we define $\underline{F}_a : \mathbb{R} \to [0, 1]$ by $\underline{F}_a(t) := \inf_{x \le a} F_x(t)$ and
$\overline{F}_a : \mathbb{R} \to [0, 1]$ by $\overline{F}_a(t) := \sup_{x \le a} F_x(t)$.

Definition 2.3. The set of distributions $\{F_x\}$ fulfills the integrability condition
if

$$\lim_{a \to -\infty} \int x\, d\underline{F}_a(x) = \int x\, dF(x).$$

In the case of an AR-process, the theorem of dominated convergence implies
that the integrability condition is equivalent to

$$\int x\, d\underline{F}_a(x) < \infty \tag{6}$$

for some $a \in \mathbb{R}$.

The first summand in (5) can be handled by considering the distribution of
$S_{\sigma(n_1)}$. The following key result is implicitly given in Rösler (2001) in a more
general setting. The essential part of Rösler (2001) which gives the proof is
stated in Appendix A in a self-contained way. For probability measures $P$ and
$Q$, let $d_{TV}(P, Q)$ denote their total variation distance. Moreover, we define
$\tau(d) := \inf\{t : S_t > d\}$.

Lemma 2.4. Let $(S_t)_{t \in \mathbb{N}_0}$ be an AR-process which fulfills the integrability
condition with a discrete state space $\mathcal{E}$. If there exist $\varepsilon > 0$, $x_0 \in \mathbb{R}_-$ and $K > 0$
such that for all $x, y \le x_0$ with $|x - y| < K$ we have

$$d_{TV}\left(P_x^{S_1}, P_y^{S_1}\right) \le 2(1 - \varepsilon) \quad \text{and} \quad \liminf_{x_0 \to -\infty}\, \inf_{z < y < x_0} P_z(S_{\tau(y)} - y < K) > 0, \tag{7}$$

then it holds for any $a \in \mathbb{R}_-$

$$\lim_{x_0 \to -\infty}\, \sup_{x, y < x_0} d_{TV}\left(P_x^{S_{\tau(a)}}, P_y^{S_{\tau(a)}}\right) = 0.$$

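The asymptotic behavior of the second summand in (5) can be analyzed by
using the elementary renewal theorem. Since the Markov chain $(S_t)_{t \in \mathbb{N}_0}$ is not a
renewal process, we couple it with three renewal processes using the functions
$F$, $\underline{F}_a$ and $\overline{F}_a$. Because of the convergence $\lim_{x \to -\infty} F_x(t) = F(t)$ the functions
$\underline{F}_a$ and $\overline{F}_a$ are again distribution functions.

Considering the AR-process $(S_t)$ from above, there exists a sequence of
independent random variables $(U_t)_{t \in \mathbb{N}}$ uniformly distributed on $[0, 1]$ such that

$$Y_t = F_{S_{t-1}}^{-1} \circ U_t \quad \text{for all } t \in \mathbb{N}.$$

For $a \in \mathbb{R}$, we define three renewal processes $\overline{S}^{(a)}$, $\underline{S}^{(a)}$ and $\tilde{S}$ by
$\overline{S}^{(a)}_0 = \underline{S}^{(a)}_0 = \tilde{S}_0 = S_0$ and the i.i.d. increments $\overline{Y}^{(a)}_t$, $\underline{Y}^{(a)}_t$ and $\tilde{Y}_t$ given by

$$\overline{Y}^{(a)}_t = \underline{F}_a^{-1} \circ U_t, \qquad \underline{Y}^{(a)}_t = \overline{F}_a^{-1} \circ U_t \qquad \text{and} \qquad \tilde{Y}_t = F^{-1} \circ U_t.$$

Thus, for all $t \in \mathbb{N}$ we have $\underline{Y}^{(a)}_t \le S_t - S_{t-1} \le \overline{Y}^{(a)}_t$ whenever $S_{t-1} \le a$.
Moreover, for each $t \in \mathbb{N}$ the sequence $\overline{Y}^{(a)}_t$ is decreasing and $\underline{Y}^{(a)}_t$ is increasing
as $a \to -\infty$. Both sequences converge almost surely to $\tilde{Y}_t$.
Finally, we define the following stopping times for $a, d \in \mathbb{R}$:

$$\tau(d) := \inf\{t : S_t > d\}, \qquad \gamma(d) := \inf\{t : S_t - S_0 > d\},$$
$$\overline{\tau}^{(a)}(d) := \inf\{t : \overline{S}^{(a)}_t > d\}, \qquad \overline{\gamma}^{(a)}(d) := \inf\{t : \overline{S}^{(a)}_t - \overline{S}^{(a)}_0 > d\},$$
$$\underline{\tau}^{(a)}(d) := \inf\{t : \underline{S}^{(a)}_t > d\}, \qquad \underline{\gamma}^{(a)}(d) := \inf\{t : \underline{S}^{(a)}_t - \underline{S}^{(a)}_0 > d\},$$
$$\tilde{\tau}(d) := \inf\{t : \tilde{S}_t > d\} \qquad \text{and} \qquad \tilde{\gamma}(d) := \inf\{t : \tilde{S}_t - \tilde{S}_0 > d\}.$$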



Using these renewal processes, Bruhn (1996) shows the following result.
(The proof is given in Appendix B.)

Lemma 2.5 (Bruhn (1996), Lemma 3.4). Consider an AR-process $(S_t)$
with the notations above. Then there exist a real number $a^*$ and a positive
real number $\kappa(a^*)$ such that for all measurable functions $l : \mathbb{R} \to \mathbb{R}_+$, all real
numbers $y, z$ and all $x \in \mathcal{E}$ with $x \le y \le z \le a^*$ we have

$$E_x\left[\sum_{t=\tau(y)}^{\tau(z)-1} l(S_t)\right] \le \kappa(a^*) \sum_{n=\lfloor y \rfloor}^{\lceil z \rceil}\, \sup_{t \in (n-1, n]} l(t).$$
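To investigate also recurrences where the toll function $r$ is not bounded, as is
the case for example for the Wiener index, we complete the results of Bruhn
by the following lemma and corollary.

Lemma 2.6. For all decreasing continuous functions $l : \mathbb{R} \to \mathbb{R}_+$ and
any $d \in \mathbb{R}_+$ it holds that

$$\lim_{a \to -\infty} E\left[\sum_{t=1}^{\overline{\gamma}^{(a)}(d)} l\big(\overline{S}^{(a)}_t - \overline{S}^{(a)}_0\big)\right] = \lim_{a \to -\infty} E\left[\sum_{t=1}^{\underline{\gamma}^{(a)}(d)} l\big(\underline{S}^{(a)}_t - \underline{S}^{(a)}_0\big)\right] = E\left[\sum_{t=1}^{\tilde{\gamma}(d)} l\big(\tilde{S}_t - \tilde{S}_0\big)\right] < \infty.$$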



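Proof. First, we consider the sequence $(\overline{S}^{(a)}_t)$. By the construction we know
that for each $s, t \in \mathbb{N}$ the mapping $a \mapsto \overline{Y}^{(a)}_s$ and thus the mapping
$a \mapsto \overline{S}^{(a)}_t - \overline{S}^{(a)}_0$ are decreasing and converge almost surely to $\tilde{Y}_s$ and $\tilde{S}_t - \tilde{S}_0$ as
$a \to -\infty$. This yields that for $d \in \mathbb{R}_+$ the mapping $a \mapsto \overline{\gamma}^{(a)}(d)$ is increasing
and bounded from above by $\tilde{\gamma}(d)$. It is easy to see that $\overline{\gamma}^{(a)}(d) \to \tilde{\gamma}(d)$ almost
surely as $a \to -\infty$. Since $\overline{\gamma}^{(a)}(d) \in \mathbb{N}$ for all $a \in \mathbb{R}$ and $l$ is continuous, we
obtain as $a \to -\infty$ almost surely

$$\sum_{t=1}^{\overline{\gamma}^{(a)}(d)} l\big(\overline{S}^{(a)}_t - \overline{S}^{(a)}_0\big) \longrightarrow \sum_{t=1}^{\tilde{\gamma}(d)} l\big(\tilde{S}_t - \tilde{S}_0\big).$$

Furthermore, the left hand side is increasing as $a \to -\infty$ and

$$E\left[\sum_{t=1}^{\tilde{\gamma}(d)} l\big(\tilde{S}_t - \tilde{S}_0\big)\right] \le l(0)\, E[\tilde{\gamma}(d)],$$

where we use that $l$ is decreasing. The positivity of $\tilde{Y}_s$ ensures by Gut (1988,
Chapter II, Theorem 3.1) that $E[\tilde{\gamma}(d)] < \infty$ and the claim follows for the first
sum.

With the same arguments, we have

$$\sum_{t=1}^{\underline{\gamma}^{(a)}(d)} l\big(\underline{S}^{(a)}_t - \underline{S}^{(a)}_0\big) \longrightarrow \sum_{t=1}^{\tilde{\gamma}(d)} l\big(\tilde{S}_t - \tilde{S}_0\big) \tag{8}$$

almost surely as $a \to -\infty$ and the left hand side is decreasing. We have

$$E\left[\sum_{t=1}^{\underline{\gamma}^{(a)}(d)} l\big(\underline{S}^{(a)}_t - \underline{S}^{(a)}_0\big)\right] \le l(0)\, E[\underline{\gamma}^{(a)}(d)].$$

The monotone convergence theorem provides $\lim_{a \to -\infty} E[\underline{Y}^{(a)}_t] = E[\tilde{Y}_t] > 0$.
Thus, $E[\underline{Y}^{(a)}_1] > 0$ for $a \in \mathbb{R}$ small enough and the elementary renewal theorem
(see e.g. Gut, 1988, Section II.4) implies $E[\underline{\gamma}^{(a)}(d)] < \infty$. So, the claim follows
from (8) by the monotone convergence theorem. □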

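Choosing $l(x) = \exp(-\alpha x)$ with $\alpha > 0$ yields the following result.

Corollary 2.7. For $\alpha, d > 0$ there exists a constant $c \in \mathbb{R}$ such that for each
$\varepsilon > 0$ there exists $n_0 \in \mathbb{N}$ with

$$E_{-\log n}\left[n^{-\alpha} \sum_{t=0}^{\tau(-\log n + d)} \exp(-\alpha S_t)\right] \in (c - \varepsilon, c + \varepsilon)$$

for all $n \ge n_0$.

Proof. By construction we have for $-\log n + d \le a$

$$\sum_{t=0}^{\overline{\gamma}^{(a)}(d)} \exp\big(-\alpha(\overline{S}^{(a)}_t - \overline{S}^{(a)}_0)\big) \le \sum_{t=0}^{\gamma(d)} \exp\big(-\alpha(S_t - S_0)\big) \le \sum_{t=0}^{\underline{\gamma}^{(a)}(d)} \exp\big(-\alpha(\underline{S}^{(a)}_t - \underline{S}^{(a)}_0)\big).$$

For $\varepsilon > 0$, Lemma 2.6 provides $a^* \in \mathbb{R}$ such that for all $a \le a^*$ we have

$$E\left[\sum_{t=0}^{\underline{\gamma}^{(a)}(d)} \exp\big(-\alpha(\underline{S}^{(a)}_t - \underline{S}^{(a)}_0)\big)\right] - E\left[\sum_{t=0}^{\overline{\gamma}^{(a)}(d)} \exp\big(-\alpha(\overline{S}^{(a)}_t - \overline{S}^{(a)}_0)\big)\right] < \varepsilon.$$

We choose $n_0$ such that $-\log n_0 + d \le a^*$. Since we have for $n \ge n_0$

$$E_{-\log n}\left[n^{-\alpha} \sum_{t=0}^{\tau(-\log n + d)} \exp(-\alpha S_t)\right] = E_{-\log n}\left[\sum_{t=0}^{\gamma(d)} \exp\big(-\alpha(S_t - S_0)\big)\right],$$

the claim follows using Lemma 2.6 once more. □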



3 Recurrences for the random split tree 

We consider a random split tree with the notation as introduced in Section 1
and set $\nu_n(\{k\}) := b \frac{k}{n} P(I_{n,1} = k) + \frac{s_0}{n} 1_{\{k = n - s_0\}}$. This function $\nu_n$ defines a
probability measure on the set $\{0, \dots, n - s_0\}$. This is seen by summing up all
values

$$\sum_{k=0}^{n-s_0} \nu_n(\{k\}) = \frac{b}{n} E[I_{n,1}] + \frac{s_0}{n} = \frac{n - s_0}{n} + \frac{s_0}{n} = 1.$$

For the rest of the paper, we consider the Markov chain $(S_t)_{t \in \mathbb{N}_0}$ from Section
2 where the transition probabilities are given by this special choice of $\nu_n$. In
this section, we prove that for this choice the conditions of the lemmata of the
previous section are fulfilled.
3.1 The distribution of the subtree size



When doing this, we frequently use the fact that the size of the first subtree 
rescaled properly converges. 

Lemma 3.1. For $\varepsilon > 0$ we have

InA 



P 



V 



n 



>.) <2exp(-^(l + o(i 



In particular, this yields 



E 



InA 



n 



V 



O n-3 



Proof. Starting from the distribution of $I_{n,1}$ given in (4) we obtain by
Bernstein's inequality









-L 




n 







= J P {\Bm{r]n,x) — nx\ > ne) dP^ (x) 



Since $|I_{n,1}/n - V| \le 1$, this yields for the expectation

InA 



E 



V 



n 



E 



< n^t + 2 exp 
O ( 



+ 1 



1 / n^/^ / „ / I 



In, 



n 



V 



4 V \n 



□ 



At this point, we prove some asymptotic expansions needed later. 

Lemma 3.2. For the size of the first subtree $I_{n,1}$ in a random split tree with
splitting distribution $V$ it holds

$$E[I_{n,1}^2] = E[V^2]\, n^2 + o(n^2),$$

$$E[I_{n,1} \log I_{n,1}] = \frac{1}{b}\, n \log n + E[V \log V]\, n + o(n)$$

and

$$E[I_{n,1}^2 \log I_{n,1}] = E[V^2]\, n^2 \log n + E[V^2 \log V]\, n^2 + o(n^2).$$



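Proof. We have

$$E[I_{n,1}^2] = \int_0^1 E\big[(\mathrm{Bin}(\eta_n, x) + s_1)^2\big]\, dP^V(x) = \int_0^1 \big(\eta_n x(1 - x) + \eta_n^2 x^2\big)\, dP^V(x) + O(n) = E[V^2]\, n^2 + o(n^2). \tag{9}$$

Furthermore, we have by Lemma 3.1 $I_{n,1}/n \to V$ in probability. Since $x \mapsto x^k \log x$
is bounded on the interval $[0, 1]$, we obtain for $k = 1, 2$

$$E\left[\Big(\frac{I_{n,1}}{n}\Big)^k \log \frac{I_{n,1}}{n}\right] \to E[V^k \log V].$$

This implies

$$E\left[I_{n,1}^k \log \frac{I_{n,1}}{n}\right] = E[V^k \log V]\, n^k + o(n^k).$$

On the other hand we have

$$E\left[I_{n,1}^k \log \frac{I_{n,1}}{n}\right] = E\big[I_{n,1}^k \log I_{n,1}\big] - E\big[I_{n,1}^k\big] \log n.$$

The claims follow with result (9) since we have $E[I_{n,1}] = (n - s_0)/b$. □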



□ 



3.2 The Markov chain for the random split tree

Now, we consider the Markov chain from Section 2 with the transition
probabilities $\nu_n(\{k\}) = b \frac{k}{n} P(I_{n,1} = k) + \frac{s_0}{n} 1_{\{k = n - s_0\}}$.

Lemma 3.3. The process $(S_t)_{t \in \mathbb{N}_0}$ is an AR-process and the corresponding set
of distributions $\{F_x\}$ fulfills the integrability condition.



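Proof. Since $\nu_n$ is a probability measure on the set $\{0, \dots, n - s_0\}$ we have
$Y_t > 0$ for all $t$. For $x = -\log n$ we have by dominated convergence and Lemma
3.1 for any $y \in \mathbb{R}$

$$F_x(y) = P(Y_1 \le y \mid S_0 = x) = \sum_{k \in \mathbb{N} :\, -\log(k/n) \le y} b\, \frac{k}{n}\, P(I_{n,1} = k) + \frac{s_0}{n} 1_{\{n - s_0 \ge e^{-y} n\}}$$

$$= b\, E\left[\frac{I_{n,1}}{n} 1_{\{-\log(I_{n,1}/n) \le y\}}\right] + \frac{s_0}{n} 1_{\{n - s_0 \ge e^{-y} n\}} \longrightarrow b\, E\big[V 1_{\{-\log V \le y\}}\big] =: F(y).$$

Moreover, we obtain with Fubini's Theorem

$$\int_0^\infty t\, dF(t) = \int_0^\infty (1 - F(t))\, dt = \int_0^\infty b\, E\big[V 1_{\{-\log V > t\}}\big]\, dt = -b E[V \log V].$$

This yields $0 < \int t\, dF(t) < \infty$.

It remains to show the integrability condition, which means

$$\int t\, d\underline{F}_a(t) < \infty$$

for an $a \in \mathbb{R}$ and $\underline{F}_a(t) := \inf_{x \le a} F_x(t)$. Using again Fubini's Theorem we
obtain

$$\int_0^\infty t\, d\underline{F}_a(t) = \int_0^\infty \int_0^\infty 1_{[0,t]}(y)\, dy\, d\underline{F}_a(t) = \int_0^\infty \int_0^\infty 1_{[y,\infty)}(t)\, d\underline{F}_a(t)\, dy.$$

Since

$$\int_0^\infty 1_{[y,\infty)}(t)\, d\underline{F}_a(t) = \lim_{z \to \infty} \underline{F}_a(z) - \underline{F}_a(y) \le 1 - \underline{F}_a(y)$$

it follows for $a = -\log m$

$$\int_0^\infty t\, d\underline{F}_a(t) \le \int_0^\infty \sup_{x \le a} (1 - F_x(y))\, dy \le \int_0^\infty b \sup_{n \ge m} E\left[\frac{I_{n,1}}{n} 1_{\{-\log(I_{n,1}/n) > y\}}\right] dy \le \int_0^\infty b\, e^{-y}\, dy < \infty,$$

where we used that $\frac{I_{n,1}}{n} \le e^{-y}$ on the event $\{-\log(I_{n,1}/n) > y\}$.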



□ 
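The convergence $F_x \to F$ shown in the proof can be observed numerically in the special case of the random binary search tree ($b = 2$, $s = s_0 = 1$, $s_1 = 0$, uniform splitter). In this case $I_{n,1}$ is uniform on $\{0, \dots, n-1\}$, so $\nu_n(\{k\}) = 2k/n^2 + \frac{1}{n} 1_{\{k = n-1\}}$, and the limit is $F(y) = 2 \int_{e^{-y}}^{1} v\, dv = 1 - e^{-2y}$ with mean $1/2 = \mu$. The following Python sketch uses the hypothetical names `nu_bst`, `F_x` and `F_limit`; it is an illustration for this one example only.

```python
import math


def nu_bst(n):
    """Weights nu_n({k}), k = 0, ..., n - 1, for the binary search tree case."""
    weights = [2.0 * k / n * (1.0 / n) for k in range(n)]  # b (k/n) P(I_{n,1} = k)
    weights[n - 1] += 1.0 / n                              # extra atom s0/n at k = n - s0
    return weights


def F_x(n, y):
    """F_x(y) = P(Y_1 <= y | S_0 = -log n).

    The state k = 0 corresponds to S_1 = 1 and is left out; it carries no
    mass here for n >= 2.
    """
    nu = nu_bst(n)
    return sum(nu[k] for k in range(1, n) if -math.log(k / n) <= y)


def F_limit(y):
    """F(y) = b E[V 1{-log V <= y}] for b = 2 and V uniform on [0, 1]."""
    return 1.0 - math.exp(-2.0 * y)
```

Comparing `F_x(n, y)` with `F_limit(y)` for growing `n` shows the approximation improving, in line with the lemma.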



Lemma 3.4. The process $(S_t)_{t \in \mathbb{N}_0}$ fulfills the assumptions of Lemma 2.4.

Proof. In the previous proof we have already shown that $(S_t)_{t \in \mathbb{N}_0}$ is an
AR-process which fulfills the integrability condition. The state space
$\mathcal{E} = \{-\log n \mid n \in \mathbb{N}\} \cup \{1\}$ is discrete. It remains to show conditions (7).
Let $x = -\log n$ and $y = -\log m$ with $m \le n$. We have

$$d_{TV}\left(P_x^{S_1}, P_y^{S_1}\right) = 2 - 2 \sum_{z \in \mathcal{E}} \min\{P_x(S_1 = z), P_y(S_1 = z)\}. \tag{10}$$
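We will show that there exist $0 < \tilde{\alpha} < \tilde{\beta} < 1$ such that for $n$ large enough

$$0 < \sum_{k = \lceil \tilde{\alpha} n \rceil + s_1}^{\lfloor \tilde{\beta} n \rfloor + s_1} \min\left\{ \int_0^1 \binom{\eta_l - 1}{k - s_1 - 1} z^{k - s_1} (1 - z)^{\eta_l - k + s_1}\, dP^V(z) \;\Big|\; l \in \{n, m\} \right\}. \tag{11}$$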
For $k = cn + o(n)$ with $c \in (0, 1)$ and $n \to \infty$ we have

$$P_x(S_1 = -\log k) = b\, \frac{k}{n}\, P(I_{n,1} = k) + \frac{s_0}{n} 1_{\{k = n - s_0\}}$$

$$= b\, \frac{k}{k - s_1}\, \frac{\eta_n}{n} \int_0^1 \frac{k - s_1}{\eta_n}\, P(\mathrm{Bin}(\eta_n, z) = k - s_1)\, dP^V(z) + \frac{s_0}{n} 1_{\{k = n - s_0\}}$$

$$= (1 + o(1))\, b \int_0^1 \binom{\eta_n - 1}{k - s_1 - 1} z^{k - s_1} (1 - z)^{\eta_n - k + s_1}\, dP^V(z) + o(1).$$

Hence, inequality (11) and equation (10) will imply

$$d_{TV}\left(P_x^{S_1}, P_y^{S_1}\right) \le 2 - 2\varepsilon$$

for some $\varepsilon > 0$. The condition $|x - y| < K$ is equivalent to $m > e^{-K} n$.
By the general assumption, the distribution of $V$ has a Lebesgue density $f_V$.
Thus, there exists $z \in (0, 1)$ with $f_V(z) > 0$. Theorem 3 in Section 1.7.2 of
Evans and Gariepy (1992) (which is a corollary of the Lebesgue-Besicovitch
Differentiation Theorem) implies that we can find a non-empty interval $(\alpha, \beta) \subset (0, 1)$
and $\varepsilon_1 > 0$ such that $\lambda(\{z \in (\alpha, \beta) \mid f_V(z) < \varepsilon_1\}) = 0$ with $\lambda$ the Lebesgue
measure. Now, we can choose some $\varepsilon_2 > 0$ and $K > 0$ with
$\tilde{\alpha} := \alpha + \varepsilon_2 < e^{-K}(\beta - \varepsilon_2) =: \tilde{\beta}$.

We will show that for n large enough, for all k S [an + si, f3n + si] H IN and for 
all I £ [e^^n, n] n IN it holds 



2 n + 1 

First, we consider the function g : z ^ ^k-sif^Y — z)^'-^^^'^^ . Integration by parts 
yields 

^ ^ ^"-(ry, + i)^,U-^i-iy ■ ^ ^ 

For k = cqi + si the function reaches its maximum at i = c, is increasing 
on the interval [0, c] and decreasing on [c, 1]. Therefore, we have for any £3 G 
(0,cA(l-c)) 

Z^^'(l - Z)(l-^)'''dz < 5c(£3)'" 



JO 



/o 

and 



where we set gd^^i) '■= (c — e3)'^(l — c + £3)'-"'^ Stirling's formula yields 

('^'~^^ ' ~ V2vrc(l-c)i((l - c)^--cy^^i = \flJ^U^r^i. 
\cr]i -IJ c " \ c 

Considering the derivative of $g_c$ in a neighborhood of 0, we obtain $g_c(x) <
g_c(0) < 1$ for all $x \ne 0$ with $|x|$ small enough. More precisely, for all $c \in [\tilde{\alpha}, \tilde{\beta}]$
and $\varepsilon_3 > 0$ small enough we have $g_c(\varepsilon_3)/g_c(0) \in (0, C)$ for some constant $C < 1$.
Thus, for $\varepsilon_3 > 0$ small enough and $l$ large enough we have



c— £3 



'i\cr]i-lj rji + 1 



and 



c+es 4\cr]i-lJ 77i + l 



Together with (|12p . this implies for some < £3 < £2; ^ large enough and 
c e [a, /3] with ctji £ M 



,_,3 Vc^?i - V ^ ^ - 2 r/i + 1 

We obtain for any k G [dn + si, /?n + si] Pi IN and / G [e~^n, n] n IN when n is 
large enough 



2 n + 1 
This finally yields (jlip : 



- si - 1/ 

J a \k- Si - I J 

1 a 
- 2 '77^ + 1 
1 a 



L/3nJ+si 

min 

fc= [dn] +si 
1 



A; — Si — 1 



'1(1 -2)^'-^+'idP^(z) I / 



n, m|> 



> -ei (^-d) d + o(l) 

> 0. 



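As in the proof of Lemma 3.3 we see that

$$P_z(S_{\tau(y)} - y < K) \ge \inf_{x \le y} P_x(S_1 - S_0 < K) \quad \text{and} \quad \lim_{y \to -\infty}\, \inf_{x \le y} P_x(S_1 - S_0 < K) \ge b\, E\big[V 1_{\{V > e^{-K}\}}\big].$$

Since $e^{-K} < 1$ the general assumption $F_V(x) < 1$ for all $x < 1$ implies
$b\, E[V 1_{\{V > e^{-K}\}}] > 0$. This shows the second condition and the proof is finished.

□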



4 The internal path length 



After these preliminaries, we are now able to prove Theorem 1.1. To show
Theorem 1.1 we have to prove that the sequence

$$H_n := \frac{E[P_n] - \mu^{-1} n \log n}{n}$$

converges. The internal path length $P_n$ satisfies a recursive representation (see
e.g. Neininger and Rüschendorf, 1999, equation (50)) from where we get

$$E[P_n] = b \sum_{k=0}^{n - s_0} P(I_{n,1} = k)\, E[P_k] + n - s_0.$$

This recursion formula implies

$$H_n = \sum_{k=0}^{n - s_0} \nu_n(\{k\}) H_k + t(n) - \frac{s_0}{n} H_{n - s_0}$$

with $t(n) = \frac{1}{n}\big(n - s_0 - \mu^{-1} n \log n + b \mu^{-1} E[I_{n,1} \log I_{n,1}]\big)$ and $\nu_n(\{k\})$ as in the
previous section.
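The recursion for $E[P_n]$ and the normalized sequence $H_n$ can be traced numerically in the binary search tree case, where $b = 2$, $s_0 = 1$, $\mu = 1/2$ and $P(I_{n,1} = k) = 1/n$ for $k = 0, \dots, n-1$. The following Python sketch (with the hypothetical function name `normalized_mean_path_length`) iterates the recursion with a running prefix sum. It only illustrates the statement of Theorem 1.1 for this one example, in which the limit of $H_n$ is the classical value $2\gamma - 4$ with Euler's constant $\gamma$.

```python
import math


def normalized_mean_path_length(n_max):
    """Return the list h with h[n] = (E[P_n] - n log(n) / mu) / n for n >= 1.

    Binary search tree case: b = 2, s0 = 1, mu = 1/2 and I_{n,1} uniform on
    {0, ..., n - 1}, so E[P_n] = (2/n) * sum_{k<n} E[P_k] + n - 1.
    """
    mu = 0.5
    mean_p = [0.0] * (n_max + 1)  # mean_p[n] = E[P_n], with E[P_0] = 0
    h = [0.0] * (n_max + 1)
    prefix = 0.0                  # running sum of E[P_k] over k < n
    for n in range(1, n_max + 1):
        mean_p[n] = 2.0 * prefix / n + (n - 1)
        prefix += mean_p[n]
        h[n] = (mean_p[n] - n * math.log(n) / mu) / n
    return h
```

Inspecting `h[n]` for increasing `n` shows the sequence settling down, which is the behavior that the proof below establishes in general through the Cauchy property.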

From the result about the mean of the depth in Devroye (1999) we know $H_n \le
C \log n$ for some constant $C > 0$. Therefore, we have for any $\delta_1 \in (0, 1)$

$$\left|\frac{s_0}{n} H_{n - s_0}\right| \le C s_0\, \frac{\log n}{n} = O\big(n^{-\delta_1}\big).$$

Furthermore, because of $n = b E[I_{n,1}] + s_0$ we have

$$t(n) = 1 - \frac{1}{E[V \log V]}\, E\left[\frac{I_{n,1}}{n} \log \frac{I_{n,1}}{n}\right] + O\left(\frac{\log n}{n}\right).$$

The function $x \mapsto x \log x$ is Hölder continuous. Using this and considering
the rate of convergence of $E[|I_{n,1}/n - V|]$ in Lemma 3.1 we obtain with Jensen's
inequality $t(n) = O(n^{-\delta_2})$ for some $\delta_2 > 0$. Taking all this into account, we
get

$$H_n = \sum_{k=0}^{n - s_0} \nu_n(\{k\}) H_k + r(n) \tag{13}$$

where $r(n) = O(n^{-\delta})$ for some $\delta \in (0, 1]$.



Proof of Theorem 1.1. Equation (13) shows that the condition of Lemma 2.1 is fulfilled. Thus, we start with the representation of

\[ H_n = \frac{E[P_n] - \mu^{-1} n \log n}{n} \]

given there and show that $(H_n)_{n \in \mathbb{N}}$ is a Cauchy sequence. Let $\varepsilon > 0$ be given. For the second term in (5) we keep in mind that we have already shown $|r(n)| \le C n^{-\delta}$ for some constant $0 < C < \infty$ and $\delta \in (0,1]$. We define $l : \mathbb{R} \to \mathbb{R}_+$ by $l(x) := \exp(\delta x)$. As in the proof of Theorem 4.2 in Bruhn (1996) we obtain with Lemma 2.5 for $n_1 \in \mathbb{N}$ with $-\log n_1 < a_*$

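\[
\Bigl| E_{-\log n} \sum_{t=0}^{\sigma(n_1)-1} r(\exp(-S_t)) \Bigr|
\le E_{-\log n} \sum_{t=0}^{\sigma(n_1)-1} C\, l(S_t)
\le C\, u(a_*) \sum_{k=-\infty}^{\lceil -\log n_1 \rceil} \sup_{t \in (k-1,k]} l(t)
\le C\, u(a_*) \int_{-\infty}^{\lceil -\log n_1 \rceil} l(t+1)\, dt.
\]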



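Since $\int_{-\infty}^{0} l(t)\, dt < \infty$ we can choose $n_1 \in \mathbb{N}$ such that we have for all $n, m \ge n_1$

\[ \Bigl| E_{-\log n} \sum_{t=0}^{\sigma(n_1)-1} r(\exp(-S_t)) \Bigr| < \frac{\varepsilon}{4}. \]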



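Considering the first term in (5), we set

\[ a(n_1, n) := E_{-\log n}\bigl[ H_{\exp(-S_{\sigma(n_1)})} \bigr] \]

and claim that there exists $n_0$ such that for all $n, m > n_0$ we have $|a(n_1,n) - a(n_1,m)| < \varepsilon/2$. It is

\[
\begin{aligned}
|a(n_1,n) - a(n_1,m)|
&= \bigl| E_{-\log n} H_{\exp(-S_{\sigma(n_1)})} - E_{-\log m} H_{\exp(-S_{\sigma(n_1)})} \bigr| \\
&= \Bigl| \int H_{\exp(-x)} \bigl( P_{-\log n}^{S_{\sigma(n_1)}} - P_{-\log m}^{S_{\sigma(n_1)}} \bigr)(dx) \Bigr| \\
&\le d_{TV}\bigl( P_{-\log n}^{S_{\sigma(n_1)}}, P_{-\log m}^{S_{\sigma(n_1)}} \bigr) \sup_{k \in \{0,\dots,n_1\}} |H_k|.
\end{aligned}
\]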



Since $n_1$ is fixed we have $\sup_{k \in \{0,\dots,n_1\}} |H_k| \le C < \infty$ with some constant $C \in \mathbb{R}$. Lemma 2.4 in combination with Lemma 3.4 yields the claim.
Taking everything into account, we obtain for all $n, m > \max\{n_0, n_1\}$

\[
|H_n - H_m| \le |a(n_1,n) - a(n_1,m)|
+ \Bigl| E_{-\log n} \Bigl[ \sum_{t=0}^{\sigma(n_1)-1} r(\exp(-S_t)) \Bigr] \Bigr|
+ \Bigl| E_{-\log m} \Bigl[ \sum_{t=0}^{\sigma(n_1)-1} r(\exp(-S_t)) \Bigr] \Bigr|
< \varepsilon.
\]

This shows that $(H_n)_{n \in \mathbb{N}}$ is a Cauchy sequence and thus it converges. □



Proof of Corollary 1.2. Parts a), c) and d) of Corollary 1.2 are immediate consequences of Theorem 1.1 and Neininger and Rüschendorf (1999, Theorem 5.1). To prove part b), we use that convergence with respect to the $\ell_2$-metric implies convergence of the second moments. Thus, we obtain as a consequence of part a) $\lim_{n \to \infty} \operatorname{Var}(P_n)/n^2 = E[X^2]$. Using the distributional fixed point equation characterizing $X$, we have

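\[
\begin{aligned}
E[X^2] &= E\Bigl[ \Bigl( \sum_{k=1}^{b} V_k X^{(k)} + 1 + \frac{1}{\mu} \sum_{k=1}^{b} V_k \log V_k \Bigr)^2 \Bigr] \\
&= E\Bigl[ \sum_{k=1}^{b} V_k^2 \bigl( X^{(k)} \bigr)^2 \Bigr]
+ E\Bigl[ 1 + \frac{2}{\mu} \sum_{k=1}^{b} V_k \log V_k + \frac{1}{\mu^2} \Bigl( \sum_{k=1}^{b} V_k \log V_k \Bigr)^2 \Bigr],
\end{aligned}
\]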



where we used the independence between $(V_1,\dots,V_b)$ and $(X^{(1)},\dots,X^{(b)})$ as well as the fact that $E[X^{(k)}] = 0$ for all $k$. Since $\mu = -b\,E[V_i \log V_i]$ for all $i = 1,\dots,b$ and $E[X^2] = E[(X^{(k)})^2]$ for all $k$, the claim follows. □



5 The Wiener index 

We now turn to the investigation of the Wiener index. To handle the Wiener index similarly to the internal path length, we first need a recursion formula for it. The Wiener index is the sum of the distances between all unordered pairs of balls in the tree. Let $\Delta_{k,l}$ denote the distance between the balls $k$ and $l$. Then we have

\[ W_n = \sum_{k<l} \Delta_{k,l}. \]

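Subdividing the sum into the sum for all pairs, where both balls are located in the same subtree, and the sum for all other pairs, we obtain

\[ W_n = \sum_{i=1}^{b} W^{(i)}_{I_{n,i}} + \sum_{i<j} \sum_{l \in T_{n,j}} \sum_{k \in T_{n,i}} \Delta_{k,l}, \]

where $W^{(i)}_{I_{n,i}}$ denotes the Wiener index of the $i$-th subtree $T_{n,i}$ being of size $I_{n,i}$.

For $k \in T_{n,i}$ and $l \in T_{n,j}$ with $i \ne j$ it is $\Delta_{k,l} = D_k^{(i)} + 1 + D_l^{(j)} + 1$ where $D_k^{(i)}$ is the depth of the ball $k$ with respect to the subtree $T_{n,i}$. By symmetry of $\Delta_{k,l}$ we can sum up only the first part $D_k^{(i)} + 1$ but for all ordered pairs of balls and we obtain

\[ \sum_{i<j} \sum_{l \in T_{n,j}} \sum_{k \in T_{n,i}} \Delta_{k,l} = \sum_{i \ne j} \sum_{l \in T_{n,j}} \sum_{k \in T_{n,i}} \bigl( D_k^{(i)} + 1 \bigr). \]

The summation over $k \in T_{n,i}$ yields

\[ \sum_{i \ne j} \sum_{l \in T_{n,j}} \sum_{k \in T_{n,i}} \bigl( D_k^{(i)} + 1 \bigr) = \sum_{i \ne j} \sum_{l \in T_{n,j}} \bigl( P^{(i)}_{I_{n,i}} + I_{n,i} \bigr), \]

where $P^{(i)}_{I_{n,i}}$ denotes the internal path length of the $i$-th subtree $T_{n,i}$. Since there are altogether $n - I_{n,i}$ balls not lying in $T_{n,i}$, we finally obtain the recursion formula for the Wiener index of the random split tree with $n$ balls:

\[ W_n = \sum_{i=1}^{b} \Bigl[ W^{(i)}_{I_{n,i}} + (n - I_{n,i})\, P^{(i)}_{I_{n,i}} + I_{n,i}(n - I_{n,i}) \Bigr]. \tag{14} \]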
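The recursion (14) can be checked on small deterministic trees. The following sketch assumes the simplest setting of exactly one ball per node, so that the root holds one ball and the subtree sizes add up to $n - 1$, and compares the recursion with a brute-force sum over all pairs of nodes. The helper names `random_tree`, `size_path_wiener` and `wiener_brute_force` are illustrative and not taken from the paper.

```python
import random
from collections import deque


def random_tree(n, b, rng):
    """Children lists of a random rooted tree on nodes 0..n-1 (root 0),
    where every node has at most b children."""
    children = [[] for _ in range(n)]
    for v in range(1, n):
        candidates = [p for p in range(v) if len(children[p]) < b]
        children[rng.choice(candidates)].append(v)
    return children


def size_path_wiener(children, root=0):
    """Return (size, internal path length, Wiener index) of the subtree
    rooted at `root`, using the recursion (14) with one ball per node."""
    subtrees = [size_path_wiener(children, c) for c in children[root]]
    n = 1 + sum(size for size, _, _ in subtrees)
    path = sum(p + size for size, p, _ in subtrees)
    wiener = sum(w + (n - size) * p + size * (n - size)
                 for size, p, w in subtrees)
    return n, path, wiener


def wiener_brute_force(children):
    """Sum of graph distances over all unordered pairs of nodes (BFS)."""
    n = len(children)
    adj = [list(c) for c in children]
    for parent, kids in enumerate(children):
        for c in kids:
            adj[c].append(parent)
    total = 0
    for source in range(n):
        dist = [-1] * n
        dist[source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if dist[w] < 0:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist)
    return total // 2  # every unordered pair was counted twice


if __name__ == "__main__":
    tree = random_tree(30, 3, random.Random(0))
    print(size_path_wiener(tree)[2] == wiener_brute_force(tree))
```

For a root with two leaf children the recursion returns size 3, path length 2 and Wiener index 4, which matches the three distances 1, 1 and 2.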
Proof of Theorem 1.4. Starting from equation (14) and taking the expectation yields

\[ E[W_n] = b \sum_{k=0}^{n-s_0} P(I_{n,1} = k) \bigl( E[W_k] + (n-k)\, E[P_k] + nk - k^2 \bigr) \tag{15} \]

because all subtrees are identically distributed. Theorem 1.1 implies $E[P_k] = \mu^{-1} k \log k + c_p k + o(k)$. Substituting this in (15) yields with $E[I_{n,1}] = n/b + o(n)$,

\[
\begin{aligned}
E[W_n] = {} & b \sum_{k=0}^{n-s_0} P(I_{n,1} = k)\, E[W_k] + \frac{b}{\mu} \bigl( n E[I_{n,1} \log I_{n,1}] - E[I_{n,1}^2 \log I_{n,1}] \bigr) \\
& + (c_p + 1) n^2 - (c_p + 1)\, b\, E[I_{n,1}^2] + o(n^2).
\end{aligned}
\tag{16}
\]

Substituting the results from Lemma 3.2 in (16) provides

\[
\begin{aligned}
E[W_n] = {} & b \sum_{k=0}^{n-s_0} P(I_{n,1} = k)\, E[W_k] + \frac{1}{\mu} \bigl( 1 - b E[V^2] \bigr) n^2 \log n \\
& - \Bigl( \frac{b}{\mu} E[V^2 \log V] + b E[V^2] - c_p \bigl( 1 - b E[V^2] \bigr) \Bigr) n^2 + o(n^2).
\end{aligned}
\tag{17}
\]
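The exact recursion (15) can also be iterated numerically. The sketch below again assumes the binary search tree case ($b = 2$, $s_0 = 1$, $I_{n,1}$ uniform on $\{0,\dots,n-1\}$, hence $\mu = 1/2$), which is an illustrative choice and not part of the proof. It computes $E[P_n]$ and $E[W_n]$ together and prints $(E[W_n] - \mu^{-1} n^2 \log n)/n^2$.

```python
import math


def mean_path_and_wiener(n_max):
    """E[P_n] and E[W_n] for n = 0..n_max in the (assumed) binary search
    tree case, iterating the recursion for E[P_n] and recursion (15)."""
    mean_p = [0.0] * (n_max + 1)
    mean_w = [0.0] * (n_max + 1)
    for n in range(1, n_max + 1):
        sum_p = 0.0
        sum_w = 0.0
        for k in range(n):  # P(I_{n,1} = k) = 1/n for k = 0..n-1
            sum_p += mean_p[k]
            sum_w += mean_w[k] + (n - k) * mean_p[k] + n * k - k * k
        mean_p[n] = 2.0 * sum_p / n + (n - 1)
        mean_w[n] = 2.0 * sum_w / n
    return mean_p, mean_w


if __name__ == "__main__":
    mu = 0.5
    _, mean_w = mean_path_and_wiener(800)
    for n in (50, 200, 800):
        print(n, (mean_w[n] - n * n * math.log(n) / mu) / (n * n))
```

For $n = 3$ every binary search tree shape has Wiener index 4, and the recursion reproduces $E[W_3] = 4$. The printed normalized values change only slowly with $n$, as expected from an expansion of the form $\mu^{-1} n^2 \log n + c_w n^2 + o(n^2)$.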

We set

\[ H_n := \frac{E[W_n] - \mu^{-1} n^2 \log n}{n}. \]

To prove Theorem 1.4 it suffices to show that for each $\varepsilon > 0$ there exists a constant $c \in \mathbb{R}$ and $n_0 \in \mathbb{N}$ such that for all $n \ge n_0$

\[ \frac{H_n}{n} \in (c - \varepsilon, c + \varepsilon). \]

So, let $\varepsilon > 0$ be given. Substituting in (17) and using Lemma [22] yields

\[ H_n = \sum_{k=0}^{n-s_0} \nu_n(\{k\})\, H_k + r(n) \]

with

\[ r(n) := -\bigl( b E[V^2] - c_p (1 - b E[V^2]) \bigr) n + o(n). \]

We set $d := -b E[V^2] + c_p (1 - b E[V^2])$. As in the proof of Theorem 1.1 the conditions of Lemma 2.1 are fulfilled and we have the representation

\[ H_n = E_{-\log n}\bigl[ H_{\exp(-S_{\sigma(n_1)})} \bigr] + E_{-\log n} \Bigl[ \sum_{t=0}^{\sigma(n_1)-1} r(\exp(-S_t)) \Bigr]. \tag{18} \]

We start again with the second term and split it in the following way

\[
E_{-\log n} \sum_{t=0}^{\sigma(n_1)-1} r(\exp(-S_t))
= E_{-\log n} \sum_{t=0}^{\tau(-\log n + d)} r(\exp(-S_t))
+ E_{-\log n} \sum_{t=\tau(-\log n + d)+1}^{\sigma(n_1)-1} r(\exp(-S_t)).
\]



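For the second summand we obtain by Lemma 2.5 with $l(x) := d \exp(-x)$ and $n_1$ large enough such that $-\log n_1 < a_*$:

\[
\frac{1}{n} \Bigl| E_{-\log n} \sum_{t=\tau(-\log n + d)+1}^{\sigma(n_1)-1} r(\exp(-S_t)) \Bigr|
\le \frac{1}{n}\, u(a_*) \sum_{k=\lfloor -\log n + d \rfloor}^{\lceil -\log n_1 \rceil} \sup_{t \in (k-1,k]} |d|\, e^{-t}
\le \frac{C}{n} \int_{-\log n + d - 3}^{-\log n_1} e^{-x}\, dx
\]

with some constant $C$. We choose $d$ large enough, such that $C e^{-d+3} < \varepsilon/3$. For this $d$ Corollary 2.7 yields $n_0 \in \mathbb{N}$ such that for all $n \ge n_0$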



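\[ \frac{1}{n} E_{-\log n} \sum_{t=0}^{\tau(-\log n + d)} r(\exp(-S_t)) \in \Bigl( c - \frac{\varepsilon}{3},\; c + \frac{\varepsilon}{3} \Bigr) \tag{19} \]

for some constant $c$. As in the proof of Theorem 1.1 the first summand in (18) is a Cauchy sequence, i.e. there exists $n_0 \in \mathbb{N}$ such that for all $n \ge n_0$ we have

\[ \frac{1}{n} \Bigl| E_{-\log n}\bigl[ H_{\exp(-S_{\sigma(n_1)})} \bigr] \Bigr| < \frac{\varepsilon}{3}. \]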

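Altogether, we have seen that for $n_1 \in \mathbb{N}$ with $-\log n_1 < a_*$, there exists $n_0 \in \mathbb{N}$ such that for all $n \ge n_0$ we have

\[
\frac{H_n}{n} = \frac{1}{n} E_{-\log n}\bigl[ H_{\exp(-S_{\sigma(n_1)})} \bigr] + \frac{1}{n} E_{-\log n} \sum_{t=0}^{\sigma(n_1)-1} r(\exp(-S_t)) \in (c - \varepsilon, c + \varepsilon)
\]

with the constant $c$ in (19). Thus, the claim follows. □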



Proof of Theorem 1.5. We define

\[ w_n := E[W_n] = \frac{1}{\mu} n^2 \log n + c_w n^2 + o(n^2), \qquad p_n := E[P_n] = \frac{1}{\mu} n \log n + c_p n + o(n) \]

and

\[ X_n := \Bigl( \frac{W_n - w_n}{n^2},\; \frac{P_n - p_n}{n} \Bigr)^{T}. \]

For $i \in \{1,\dots,b\}$ let $X_n^{(i)}$ be an independent copy of $X_n$. Since the subtrees of the random split tree are independent conditioned upon their sizes, we obtain from (14) for the standardized vector $X_n$ the following recursion formula



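\[ X_n \overset{d}{=} \sum_{i=1}^{b} A_i^{(n)} X^{(i)}_{I_{n,i}} + b^{(n)} \]

with

\[ A_i^{(n)} := \frac{1}{n^2} \begin{pmatrix} I_{n,i}^2 & I_{n,i}(n - I_{n,i}) \\ 0 & n\, I_{n,i} \end{pmatrix} \]

and $b^{(n)} = \bigl( b_1^{(n)}, b_2^{(n)} \bigr)^{T}$ where

\[ b_1^{(n)} = \frac{1}{n^2} \Bigl( \sum_{i=1}^{b} w_{I_{n,i}} + \sum_{i=1}^{b} (n - I_{n,i})\, p_{I_{n,i}} + \sum_{i=1}^{b} I_{n,i}(n - I_{n,i}) - \frac{1}{\mu} n^2 \log n - c_w n^2 + o(n^2) \Bigr) \]

and

\[ b_2^{(n)} = 1 - \frac{1}{\mu} \log n - c_p + \frac{1}{n} \sum_{i=1}^{b} p_{I_{n,i}} + o(1). \]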



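Using $\sum_{i=1}^{b} I_{n,i} = n - s_0$ it follows

\[ n \sum_{i=1}^{b} p_{I_{n,i}} - \frac{1}{\mu} n^2 \log n = \frac{n}{\mu} \sum_{i=1}^{b} I_{n,i} \log \frac{I_{n,i}}{n} + c_p\, n (n - s_0) + o(n^2) \]

and

\[ \sum_{i=1}^{b} w_{I_{n,i}} - \sum_{i=1}^{b} I_{n,i}\, p_{I_{n,i}} = (c_w - c_p) \sum_{i=1}^{b} I_{n,i}^2 + o(n^2). \]

This yields, since the remaining terms are $o(n^2)$,

\[ b_1^{(n)} = \frac{1}{\mu} \sum_{i=1}^{b} \frac{I_{n,i}}{n} \log \frac{I_{n,i}}{n} + 1 + c_p - c_w + (c_w - c_p - 1) \sum_{i=1}^{b} \frac{I_{n,i}^2}{n^2} + o(1). \tag{20} \]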



By similar arguments we have

\[ b_2^{(n)} = \frac{1}{\mu} \sum_{i=1}^{b} \frac{I_{n,i}}{n} \log \frac{I_{n,i}}{n} + 1 + o(1). \tag{21} \]



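In order to use the contraction method as in Neininger (2001, Theorem 4.1) it suffices to show that for $n \to \infty$

\[ \bigl( A_1^{(n)}, \dots, A_b^{(n)}, b^{(n)} \bigr) \xrightarrow{\;\ell_2\;} \bigl( A_1^*, \dots, A_b^*, b^* \bigr), \tag{22} \]

\[ E\Bigl[ \mathbf{1}_{\{I_{n,i} \le l\} \cup \{I_{n,i} = n\}} \bigl\| (A_i^{(n)})^{T} A_i^{(n)} \bigr\|_{op} \Bigr] \to 0 \tag{23} \]

for all $l \in \mathbb{N}$ and

\[ \sum_{i=1}^{b} E \bigl\| (A_i^*)^{T} A_i^* \bigr\|_{op} < 1 \tag{24} \]

where $\| \cdot \|_{op}$ is the operator norm.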



By Lemma 3.1 we know that $I_n/n$ converges in probability to $V := (V_1,\dots,V_b)$, which is the splitting vector. By equations (20) and (21) we have $b^{(n)} \to b^*$ in probability as $n \to \infty$ with

\[ b^* = \frac{1}{\mu} \sum_{i=1}^{b} V_i \log V_i \begin{pmatrix} 1 \\ 1 \end{pmatrix} + \begin{pmatrix} (1 + c_p - c_w) \bigl( 1 - \sum_{i=1}^{b} V_i^2 \bigr) \\ 1 \end{pmatrix}. \]

By the boundedness of the function $x \mapsto x \log x$ on $[0,1]$ and as $I_{n,i}/n \in [0,1]$ there exists a constant $C$ such that

\[ \bigl| b_1^{(n)} \bigr| \le C \quad \text{and} \quad \bigl| b_2^{(n)} \bigr| \le C. \]

Thus, we get the uniform integrability of $(b_1^{(n)})^2$ and $(b_2^{(n)})^2$ and consequently the convergence of $b^{(n)}$ with respect to the $\ell_2$-metric. Similar arguments yield the convergence of $A_i^{(n)}$ with respect to the $\ell_2$-metric to

\[ A_i^* = \begin{pmatrix} V_i^2 & V_i (1 - V_i) \\ 0 & V_i \end{pmatrix}. \]

This shows condition (22).
Condition (23) follows from the deterministic boundedness of $\| A_i^{(n)} \|_{op}$ and from the fact that

lim P < /} U {In,i = n}) 



lim f P{Bm{r]n,x) < I - si)dP^ (x) 



< lim P y < ((/-si)/7?„)3 



+ lim 



1 exp I -^Vn{l - Si) 3 M 



2\ 2N 
/ - SlN 3 



dP^(x) 



where we used Bernstein's inequality. 

It remains to show (24). Solving the characteristic equation for the matrix $(A_i^*)^{T} A_i^*$ we obtain that its eigenvalue $\lambda(V_i)$ being larger in absolute value is given by

\[ \lambda(V_i) = V_i^2 \Bigl( 1 - V_i + V_i^2 + (1 - V_i) \sqrt{V_i^2 + 1} \Bigr). \]

Elementary calculations show $x > x^2 \bigl( 1 - x + x^2 + (1 - x) \sqrt{x^2 + 1} \bigr)$ for all $x \in (0,1)$. Thus, we have $E[\lambda(V_i)] < E[V_i] = 1/b$ because it is $P(V_i \in \{0,1\}) = 0$. This finally implies

\[ \sum_{i=1}^{b} E \bigl\| (A_i^*)^{T} A_i^* \bigr\|_{op} = \sum_{i=1}^{b} E[\lambda(V_i)] < 1. \]
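The eigenvalue formula and the elementary inequality $x > \lambda(x)$ can be verified numerically. The sketch below assumes that the limit matrix has the upper triangular form with rows $(v^2,\, v(1-v))$ and $(0,\, v)$, which is what the recursion (14) gives when $W_n$ is scaled by $n^2$ and $P_n$ by $n$. The function names are illustrative only.

```python
import math


def lam_formula(v):
    """Closed form for the largest eigenvalue of A^T A."""
    return v * v * (1 - v + v * v + (1 - v) * math.sqrt(v * v + 1))


def lam_numeric(v):
    """Largest eigenvalue of A^T A for the assumed matrix
    A = [[v^2, v(1-v)], [0, v]], from the trace and determinant
    of the symmetric 2x2 matrix M = A^T A."""
    a11, a12, a22 = v * v, v * (1 - v), v
    m11 = a11 * a11
    m12 = a11 * a12
    m22 = a12 * a12 + a22 * a22
    trace = m11 + m22
    det = m11 * m22 - m12 * m12
    disc = max(trace * trace - 4.0 * det, 0.0)
    return (trace + math.sqrt(disc)) / 2.0


if __name__ == "__main__":
    for v in (0.1, 0.5, 0.9):
        print(v, lam_formula(v), lam_numeric(v), lam_formula(v) < v)
```

On a grid in $(0,1)$ both computations agree up to rounding and the eigenvalue stays strictly below $v$, which is the inequality used for the contraction condition (24).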



The claim for the asymptotic behavior of the variance of $W_n$ follows directly from the first part, since convergence with respect to the $\ell_2$-metric implies convergence of the second moments. □

A Proof of Lemma 2.4

We give the essential parts of Rösler (1991) which prove Lemma 2.4.

Proof of Lemma 2.4. Let $a \in \mathbb{R}_-$. We use the notation

\[ \Delta(a) := \lim_{x_0 \to -\infty} \sup_{x,y < x_0} d_{TV}\bigl( P_x^{S_{\tau(a)}}, P_y^{S_{\tau(a)}} \bigr). \]

Since the function

\[ x_0 \mapsto \sup_{x,y < x_0} d_{TV}\bigl( P_x^{S_{\tau(a)}}, P_y^{S_{\tau(a)}} \bigr) \]

is increasing and non-negative, the limit for $x_0 \to -\infty$ exists. We will show that $\Delta(a) \le (1 - \varepsilon) \Delta(a) + \delta$ for some $\varepsilon > 0$ and all $\delta > 0$. Then the claim follows.

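Let $\delta > 0$ be an arbitrary number. Since the process $S$ fulfills the integrability condition and $S_{\tau(y)} - y \le S_{\tau(y)} - S_{\tau(y)-1}$, there exists $x_1 \in \mathbb{R}_-$ such that for all $x < y < x_1$

\[ E_x[S_{\tau(y)} - y] \le \int z \, dF_Y(z) \le C < \infty \]

for some constant $C$. Thus, there exists $K_1 > K$ such that for all $y < x_1$

\[ P_x\bigl( S_{\tau(y)} - y > K_1 \bigr) \le \frac{E_x[S_{\tau(y)} - y]}{K_1} \le \frac{\delta}{4}. \tag{25} \]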
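Furthermore, we have for this $K_1$

\[ \sup_{y+K < z \le y+K_1} d_{TV}\bigl( P_z^{S_{\tau(a)}}, P_y^{S_{\tau(a)}} \bigr) \le \sup_{u,v \le y+K_1} d_{TV}\bigl( P_u^{S_{\tau(a)}}, P_v^{S_{\tau(a)}} \bigr). \tag{26} \]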

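The distribution of the Markov chain $S$ on the state space $\mathcal{E}$ is given by the kernel

\[ \kappa(x, A) := P(S_{t+1} \in A \mid S_t = x) \quad \text{for all } t \in \mathbb{N}_0 \text{ and } A \subset \mathcal{E}. \]

Let $S^{(a)}$ be the process $S$ stopped at the moment when it exceeds $a \in \mathcal{E}$. The kernel $\kappa_a$ corresponding to the process $S^{(a)}$ is then given by $\kappa_a(x, A) = \kappa(x, A)$ for $x \le a$ and $\kappa_a(x, A) := \mathbf{1}_A(x)$ for $x > a$ and for all $A \subset \mathcal{E}$.
Let $D := \{(x,x) \mid x \in \mathcal{E}\}$ denote the diagonal in $\mathcal{E}^2$. We define a kernel $q$ on $\mathcal{E}^2$ by the so-called Wasserstein coupling (see e.g. Griffeath (1974/75)), i.e. for $(x,y), (u,v) \in \mathcal{E}^2$ it is

\[
q\bigl( (x,y), (u,v) \bigr) :=
\begin{cases}
\min\{ \kappa_a(x,u), \kappa_a(y,v) \}, & \text{if } u = v, \\[4pt]
\dfrac{ \bigl( \kappa_a(x,u) - \kappa_a(y,u) \bigr)_+ \bigl( \kappa_a(y,v) - \kappa_a(x,v) \bigr)_+ }{ 1 - \alpha(x,y) }, & \text{if } u \ne v,
\end{cases}
\]

where $\alpha(x,y) := \sum_{z \in \mathcal{E}} \min\{ \kappa_a(x,z), \kappa_a(y,z) \}$ and $r_+ = \max\{r, 0\}$ denotes the positive part of a real number $r$. Then the following properties hold: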

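a) $q((x,y), A \times \mathcal{E}) = \kappa_a(x, A)$ and $q((x,y), \mathcal{E} \times A) = \kappa_a(y, A)$ for all $x, y \in \mathcal{E}$ and $A \subset \mathcal{E}$,

b) $q((x,x), D) = 1$ for all $x \in \mathcal{E}$ and

c) $q((x,y), D^c) \le 1 - \varepsilon$ for all $x, y \in \mathcal{E}$ with $|x - y| \le K$ and $x, y \le x_0$.

The property c) follows from the assumption (7) and the fact that

\[ \sum_{z \in \mathcal{E}} \bigl| \kappa_a(x,z) - \kappa_a(y,z) \bigr| = 2 \Bigl( 1 - \sum_{z \in \mathcal{E}} \min\{ \kappa_a(x,z), \kappa_a(y,z) \} \Bigr). \]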

For $(x,y) \in \mathcal{E}^2$ let $Z^{(x,y)} = (U^{(x,y)}, V^{(x,y)})$ be the Markov chain generated by the kernel $q$ which starts in $(x,y)$. We define the stopping time

\[ \vartheta(a) := \inf\{ t \mid Z_t^{(x,y)} \in (a,\infty) \times (a,\infty) \}. \]

Using this coupling we obtain for any $K_2 > 0$ and $x, y < a$



P U, 



J2 \^^(^r{a) =W) - Py{Sr{a) = w)\ 

E 



(n,7;) 



P ( [/r-^ = w I Z^^) = (n, z;)) - P (yjj)^) = w \ zj^'^^ = (n, t;)) 



r(2:J/) 

(a) 



= -P«("S'T(a)=«') 



= Pv{S^(a)=w) 



u,v<y+K2 

(27) 

In the last step we used that $P_u(S_{\tau(a)} = w) - P_v(S_{\tau(a)} = w) = 0$ for $u = v$. As
seen in equation (25) and using property a) of the coupling, there exists by the
integrability condition K2 > K such that for all y < xi — and y < z < y + K 

P (zi"'^^ ^ (-00, y + < {z, (-00, y + K2r) + (y, (-00, y + Ka]'^) 



6 

< -. 

- 4 



(28) 



After these preliminaries, we now turn to $\Delta(a)$. It is for $x < y < a - K$



[y,y+K] 
+



+ 



(y+K,y+Ki] 



{y+Ki,co) 



<PASriy)-y< K) sup dTV , Py^'"' 

y<z<y+K ^ 
+ P^iS^f^y) - y > K) sup dTV (^f , ) 

y+K<z<y+Ki ^ ' 

With the results in (25), (26), (27) and (28) as well as property c) of the kernel $q$ this finally yields



A (a) < lim sup 



Px{Sr(y) - y < K) sup d^y(Pu^'-^\p!^^^^){l-e) 



+ -y>K) sup dTV (Pf , 



+ 2P (zj^'^) ^ (_oo,y + E:2 



+ 



<A(a) lim sup (l - eP,(5,(j,) - y < ET)) + 5 
< (l-e)A(a) + (5 

where e = e\\mx^^_ao''^'^^x<y<xo Px{Sr{y) - y < K) > 0. 



□ 



B Proofs from Bruhn (1996)



Proof of Lemma l2.1l For n < ni the claim follows immediately since o"(ni) 
0. For n > ni equation ([5]) follows by induction on n. It is with -f^i/e := Hq 



Hn+i = Vn+i{{k])Hk + r{n + 1) 



fe=0 



Y P~ log{n+l) {Si = - log k) log ] 



k=l 



(T(ni)-l 



t=0 



+ -E'_iog(n+l)[»^(exp(-So))] + -P- log(n+l) ('S'l - [-f^cxp(-S^(„j))] 

(T(ni)-l 

log(n+l) 



£^-log(n+l)-f^cxp{-5,(„^)) + 



^ r(exp(-S't)) 



i=0 



where we use the Kolmogorov-Chapman equation for Markov chains in the last step. □

Proof of Lemma 2.5. We use the notation from Section 2 and define for $x \in \mathbb{R}_-$ the function $U_x$ by

\[ U_x(a) := E_x\bigl[ \, |\{ t : S_t \in (a, a+1] \}| \, \bigr]. \]

By the monotone convergence theorem we have $\lim_{a \to -\infty} E[Y_1^{(a)}] = E[Y_1] > 0$. Thus, there exists $a_* \in \mathbb{R}_-$ such that for all $a \le a_*$ it is $E[Y_1^{(a)}] > 0$. For $x, n, a \le a_*$ and $k \in \mathbb{N}$ it holds



{Sk-i<n) dpf("-^'(y) 



Px{\{t:Ste{n-l,n]}\>k)= [ Py 

J (n— l,n] 
J (n— l,nl 



•S'T(n-l) 



(y) 



< 



[ PoiS^l, < 1) dP. 

J {n—l,n] 



Thus, we have 



< Pois}:\ < 1) 

= ^o(|{t:5;'')G[0,l]}|>A;). 



k=l 

= Eo[\{t:Si''^ G[0,l]}\] 
= : u{a). 

Since it is $E[Y_1^{(a)}] > 0$ the elementary renewal theorem (see e.g. Gut (1988), Section II.4) provides $u(a) < \infty$. Furthermore, the function $a \mapsto u(a)$ is decreasing as $a \to -\infty$, i.e. $u(a) \le u(a_*)$ for all $a \le a_*$.

So we finally obtain for a function $l : \mathbb{R} \to \mathbb{R}_+$, $y, z \in \mathbb{R}$ and $x \in \mathcal{E}$ with $x \le y \le z \le a_*$:



Ex 



t=r{y) 



n=ly] 



< ^(a*) sup l{t). 

n=\y-\ *e(n-l,n] 



□ 



